Upgrade transformers==5.3.0#17784
Conversation
…ple files - Updated `huggingface_hub` dependency to version `>=1.0.0` in `pyproject_cpu.toml`, `pyproject_npu.toml`, `pyproject_other.toml`, `pyproject_xpu.toml`, and `pyproject.toml`. - Upgraded `transformers` dependency to version `5.0.0` in the same files. - Removed `hf_transfer` from the dependencies in the aforementioned files. - Refactored the handling of `rope_theta` and `rope_scaling` parameters to use `config.rope_parameters` in various model files for consistency and improved maintainability. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Summary of ChangesHello @JustinTong0323, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request primarily focuses on upgrading core dependencies, most notably the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request primarily focuses on updating the transformers library to version 5.0.0 and adapting the codebase to changes introduced in this new version. Key modifications include updating huggingface_hub and transformers dependencies, removing the hf_transfer dependency and its associated code, and refactoring the access pattern for rope_theta and rope_scaling parameters across various model files to use config.rope_parameters.get(...). Additionally, transformers API calls in test files have been updated to reflect changes in class names and output access methods. These changes are well-justified and necessary for compatibility with the upgraded transformers library, improving overall code maintainability and consistency.
…les for cleaner dependency management. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
/tag-and-rerun-ci run again again |
…files for v5 compatibility. Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
The # Wrap each token ID in its own list for batch_decode to decode them separately
# batch_decode([1, 2, 3]) concatenates tokens, batch_decode([[1], [2], [3]]) decodes separately
token_texts = self.tokenizer.batch_decode([[idx] for idx in token_logprobs_idx])at least I had to change this to make it run with sglang after manually upgrading |
Fix tokenizer behavior in auto mode to ensure compatibility with Transformers v5 by explicitly setting use_fast=True when not provided.
…lang into update-transformers-v5
|
stage-b-test-small-1-gpu (5) test_embedding_models.py passed on main
|
…asses The _fix_v5_add_bos_eos_token function was blindly restoring add_eos_token from tokenizer_config.json for all models, but Qwen2Tokenizer did not support this flag in v4. This caused gte-Qwen2-1.5B-instruct to add an unexpected EOS token, breaking embedding similarity tests. Changes: - Only restore BOS/EOS flags for tokenizer classes that supported them in v4 (LlamaTokenizer, GemmaTokenizer, etc.) - Use v4 defaults (add_bos_token=True) when config value is null/missing to prevent update_post_processor() from dropping BOS - Apply the same tokenizer fix to sentence-transformers in the HF test runner so HF reference and SRT produce matching tokens
…duce dead code - Extract compute_mla_mscale_scaling() to replace 4 copy-pasted rope scaling blocks in model_config.py and deepseek_v2.py - Extract _resolve_local_or_cached_file() to deduplicate local-path-then -hf_hub_download pattern across 3 tokenizer/processor fix functions - Extract ensure_numpy() in mm_utils.py to deduplicate torch.Tensor to numpy conversion in mm_utils.py and llava.py - Restructure get_hf_text_config() as proper elif chain with documented priority (thinker > llm > language > text), fixing assert that fired on text_config even when llm_config would override it - Add thinker_config to early dict-to-PretrainedConfig conversion loop - Remove dead dict-conversion branch in _patch_text_config (already handled by early loop in get_hf_text_config) - Use from_dict instead of from_pretrained in get_config KeyError handler to avoid redundant disk read - Add warning log for AutoImageProcessor failures in _build_processor_manually instead of silent swallow
- Fix operator precedence in is_deepseek_nsa: extract index_topk to
separate variable for clarity
- Fix llama_eagle3 wrong nested key: rope_parameters["rope_type"]
instead of rope_parameters["rope_scaling"]["rope_type"]
- Fix midashenglm same nested key bug: directly delete mrope_section
from rope_parameters instead of writing to nonexistent nested key
- Fix gemma3n_causal _tied_weights_keys: use v5 dict format
{target: source} instead of v4 list format
- Add debug logging to broad except clauses in hf_transformers_utils
- Narrow _ensure_llama_flash_attention2_compat exception to ImportError
- Improve batch_decode comment in tokenizer_manager
|
@JustinTong0323 Please let me know if you need help with fixing the issues after transformers version upgrade. Thanks! |
Thanks! I think most of the work is done and we only need to pass the CI now. |
This reverts commit d1e95af.
…ccuracy threshold PR sgl-project#17784 (transformers 5.3.0 upgrade) changed grok.py to access config.rope_parameters["rope_theta"] directly, but GitConfig (grok-2) does not have this attribute, crashing the server on startup with AttributeError: 'GitConfig' object has no attribute 'rope_parameters'. Restore safe access via getattr with fallback, matching the pattern used elsewhere in the codebase. Also lower the MI325 Grok-2 GSM8K accuracy threshold from 0.915 to 0.90 to match the MI35x test, since nightly sgl-project#636 showed 0.910 which is within normal run-to-run variance.
|
Great job. |
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com> Co-authored-by: Alison Shao <alisonshao@mac.lan> Co-authored-by: Mick <mickjagger19@icloud.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com> Co-authored-by: Alison Shao <alisonshao@mac.lan> Co-authored-by: Mick <mickjagger19@icloud.com>
|
K2.5 failed with new transformers==5.3.0: Traceback (most recent call last):
File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.10/site-packages/sglang/launch_server.py", line 68, in <module>
run_server(server_args)
File "/opt/conda/lib/python3.10/site-packages/sglang/launch_server.py", line 52, in run_server
launch_server(server_args)
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/entrypoints/http_server.py", line 2235, in launch_server
Engine._launch_subprocesses(
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/entrypoints/engine.py", line 681, in _launch_subprocesses
tokenizer_manager, template_manager = init_tokenizer_manager_func(
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/entrypoints/engine.py", line 131, in init_tokenizer_manager
tokenizer_manager = TokenizerManagerClass(server_args, port_args)
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/managers/tokenizer_manager.py", line 269, in __init__
self.init_tokenizer_and_processor()
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/managers/tokenizer_manager.py", line 335, in init_tokenizer_and_processor
_processor = _get_processor_wrapper(server_args)
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/managers/tokenizer_manager.py", line 3015, in _get_processor_wrapper
processor = get_processor(
File "/opt/conda/lib/python3.10/site-packages/sglang/srt/utils/hf_transformers_utils.py", line 1180, in get_processor
processor = AutoProcessor.from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/processing_auto.py", line 407, in from_pretrained
return processor_class.from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/processing_utils.py", line 1403, in from_pretrained
args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, processor_dict, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/processing_utils.py", line 1517, in _get_arguments_from_pretrained
tokenizer = cls._load_tokenizer_from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/processing_utils.py", line 1464, in _load_tokenizer_from_pretrained
tokenizer = auto_processor_class.from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 732, in from_pretrained
tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 583, in get_class_from_dynamic_module
return get_class_in_module(class_name, final_module, force_reload=force_download)
File "/opt/conda/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 309, in get_class_in_module
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/root/.cache/huggingface/modules/transformers_modules/tokenization_kimi.py", line 11, in <module>
from transformers.models.gpt2.tokenization_gpt2 import bytes_to_unicode
ImportError: cannot import name 'bytes_to_unicode' from 'transformers.models.gpt2.tokenization_gpt2' (/opt/conda/lib/python3.10/site-packages/transformers/models/gpt2/tokenization_gpt2.py) |
update: solved by updating |
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com> Co-authored-by: Alison Shao <alisonshao@mac.lan> Co-authored-by: Mick <mickjagger19@icloud.com>
Motivation
Address #17779 — Upgrade
transformersto5.3.0.Changes
transformers>=5.2.0,huggingface_hub>=1.0.0; removehf_transferget_rope_config()utility for backward-compatibleconfig.rope_parametersaccesspadding_idx(transformers#41541)CLIPImageProcessorFastreturningtorch.Tensorinstead ofndarraypooler_outputinstead oflast_hidden_stateuse_fast=Truein auto mode; fixspecial_tokens_pattern; synctext_configAutoConfig; GGUF version parsing workaround_apply_rotary_embimport path; fix Qwen2.5-VL.visual→.model.visual.item()and missingall_tied_weights_keysfor v5 compatTODO
config.rope_parameters)padding_idxremovalCLIPImageProcessorFasttensor handlingpooler_outputuse_fast=Truedefault,special_tokens_patternfixInvalidVersion: 'N/A'workaround_apply_rotary_emb).visualmoved to.model.visualtorch.linspace().item()+ missingall_tied_weights_keysclean_up_tokenizationremoved in v5 — InternVL's HF Hub tokenizer (trust_remote_code) still calls it;TOKENIZER_MAPPING.registeris bypassed byauto_mapis_torch_fx_availableremoved — upstream model code (moonshotai) or sglang shimdiffusers— upstreamforward_batch_embeddingfails withbatch.input_ids=None(TypeError: object of type 'NoneType' has no len()inForwardBatch.init_new)AutoProcessorfails —ValueError: Unrecognized feature extractor; v5 can't resolve feature extractor for MiniCPM-o model typeaddict,matplotlibpackages in CI (not v5-related)is_deepseek_nsa()crashes on dicthf_text_config—AttributeError: 'dict' object has no attribute 'architectures'test_matryoshka_embedding: v5 respectsconfig.is_causal=false→ bidirectional attention in HF reference, but SGLang always uses causal